Abstract: We examine statistical pictures of violent conflicts
over the last 2000 years, finding techniques for dealing with
incompleteness and unreliability of historical data.

We introduce a novel approach to apply extreme value theory
to fat-tailed variables that have a remote, but nonetheless finite
upper bound, by defining a corresponding unbounded dual
distribution (given that potential war casualties are bounded by
the world population).

We apply methods from extreme value theory on the dual 
distribution and derive its tail properties. The dual method 
allows us to calculate the real mean of war casualties, which 
proves to be considerably larger than the sample mean, meaning 
severe underestimation of the tail risks of conflicts from naive 
observation. We analyze the robustness of our results to errors 
in historical reports, taking into account the unreliability of 
accounts by historians and absence of critical data. 

We study inter-arrival times between tail events and find that 
no particular trend can be asserted. 

All the statistical pictures obtained are at variance with the 
prevailing claims about "long peace", namely that violence has 
been declining over time. 


I. Introduction 

Since the middle of the last century, there has been a
multidisciplinary interest in wars and armed conflicts (quantified in
terms of casualties), see for example [5], [?], [?], [30], [?],
[?], [?] and [?]. Studies have also covered the statistics
of terrorism, for instance [?], [?], and the special issue of
Risk Analysis on terrorism [33]. From a statistical point of
view, recent contributions have attempted to show that the
distribution of war casualties (or terrorist attacks' victims)
tends to have heavy tails, characterized by a power law
decay [?] and [?]. Often, the analysis of armed conflicts
falls within the broader study of violence [?], [?], with
the aim of understanding whether we as humans are more or
less violent and aggressive than in the past, and what role
institutions played in that respect. Accordingly, the public
intellectual arena has witnessed active debates, such as the
one between Steven Pinker on one side and John Gray on
the other, concerning the hypothesis that the long peace was
a statistically established phenomenon or a mere statistical
sampling error that is characteristic of heavy-tailed processes,
[?] and [?], the latter of which is corroborated by this paper.

Using a more recent data set containing 565 armed conflicts 
with more than 3000 casualties over the period 1-2015 AD, we 
confirm that the distribution of war casualties exhibits a very 
heavy right-tail. The tail is so heavy that — at first glance — 
war casualties could represent an infinite-mean phenomenon
as defined by [26]. But should the distribution of war casualties
have an infinite mean, the annihilation of the human species 
would be just a matter of time, and the sample properties we 
can compute from data have no meaning at all in terms of 
statistical inference. In reality, a simple argument allows us 
to rule out the infiniteness of the mean: no event or series of 
events can kill more than the whole world population. The 
support of the distribution of war casualties is necessarily 
bounded, and the true mean cannot be infinite. 

Let [L, H] be the support of the distribution of war
casualties today. L cannot be smaller than 0, and we can safely
fix it at some level L* >> 0 to ignore those small events
that are not readily definable as armed conflict [44]. As to H,
its value cannot be larger than the world population, i.e. 7.2
billion people in 2015^1 [40].

If Y is the random variable representing war casualties, its 
range of variation is very large and equal to H — L. Studying 
the distribution of Y can be difficult, given its bounded but 
extremely wide support. Finding the right parametric model 
among the family of possible ones is hard, while nonpara- 
metric approaches are more difficult to interpret from a risk 
analysis point of view. 

Our approach is to transform the data in order to apply the
powerful tools of extreme value theory. Since $H < \infty$, we
suggest a log-transformation of the data. This allows us to use
tools such as [?]. Theoretical results such as the Pickands,
Balkema and de Haan's Theorem allow us to simplify the
choice of the model to fit to our data.

Let L and H be respectively the lower and the upper bound 
of a random variable Y, and define the function 

$\varphi(Y) = L - H \log\left(\frac{H - Y}{H - L}\right)$.    (1)

It is easy to verify that

1) $\varphi$ is "smooth": $\varphi \in C^\infty$,

2) $\varphi^{-1}(\infty) = H$,

3) $\varphi^{-1}(L) = \varphi(L) = L$.

Then $Z = \varphi(Y)$ defines a new random variable with lower
bound L and an infinite upper bound. Notice that the
transformation induced by $\varphi(\cdot)$ does not depend on any of the
parameters of the distribution of Y. In what follows, we will
call the distributions of Y and Z, respectively, the real and the
dual distribution.
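The log-transformation of equation (1) and its inverse can be sketched in a few lines of Python. This is only an illustration: the function names `dual` and `real` are ours, the upper bound is the 7.2 billion figure quoted above, and the lower bound L = 10,000 is an arbitrary value chosen for the example, not one used in the paper.

```python
import math

def dual(y, L, H):
    """Map a bounded value y in [L, H) to the unbounded dual scale [L, inf)."""
    if not (L <= y < H):
        raise ValueError("y must satisfy L <= y < H")
    return L - H * math.log((H - y) / (H - L))

def real(z, L, H):
    """Inverse map: bring a dual value z in [L, inf) back to [L, H)."""
    return H - (H - L) * math.exp((L - z) / H)

# Illustrative values only (assumptions, not the paper's settings):
# H is the world population bound, L an arbitrary lower bound.
H, L = 7.2e9, 1.0e4
y = 7.3e7                 # a very large event, still far below H
z = dual(y, L, H)         # slightly larger than y, since H is remote
y_back = real(z, L, H)    # recovers y up to floating-point error
```

Because H is so remote, `z` exceeds `y` by well under one percent here, which is why the real and dual data look nearly identical except close to the bound.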

^1 Today's world population can be taken as the upper bound, as never before
has humanity reached similar numbers.


[Figure 1 panels: A. Actual War Casualties w.r.t. Population; B. Log-rescaled War Casualties w.r.t. Population; C. Actual War Casualties w.r.t. Impact; D. Log-rescaled War Casualties w.r.t. Impact]

Fig. 1: War casualties over time, using raw (A, C) and dual (B, D) data. The size of each bubble represents the size of each
event with respect to today's world population (A, B) and with respect to the total casualties (raw: C, rescaled: D) in the data
set.


By studying the tail properties of the dual distribution (the 
one with an infinite upper bound), using extreme value theory, 
we will be able to obtain, by reverting to the real distribution, 
what we call the shadow mean of war casualties. We will show 
that this mean is at least 1.5 times larger than the sample mean, 
but nevertheless finite. 

We assume that many observations are missing from the
dataset (from under-reported conflicts), and we base our
analysis on the fact that war casualties are just imprecise estimates
[?], on which historians often have disputes, without anyone's
ability to verify the assessments using period sources. For
instance, an event that took place in the eighth century, the An
Lushan rebellion, is thought to have killed 13 million people,
but there is no precise or reliable methodology to allow us to
trust that number, which could conceivably be one order of
magnitude smaller.^2 Using resampling techniques, we show
that our tail estimates are robust to changes in the quality and
^2 For a long time, the assessment of the drop in population in China was
made on the basis of the tax census; the recorded drop might instead be
attributable to a drop, in the aftermath of the rebellion, in the number of
surveyors and tax collectors. [4]


reliability of data. Our results and conclusions will replicate
even if we missed a third of the data. When focusing on the more
reliable set covering the last 500 years of data, one cannot
observe any specific trend in the number of conflicts, as large
wars appear to follow a homogeneous Poisson process. This
memorylessness in the data conflicts with the idea that war
violence has declined over time, as proposed by [?] or [?].

The paper is organized as follows: Section II describes our
data set and analyses the most significant problems with the
quality of observations; Section III is a descriptive analysis of
data; it shows some basic results, which are already sufficient
to refute the thesis [?] that we are living in a more peaceful
world on the basis of statistical observations; Section IV
contains our investigations about the upper tail of the dual
distribution of war casualties as well as discussions of tail
risk; Section V deals with the estimation of the shadow mean;
in Section VI we discuss the robustness of our results to
imprecision and errors in the data; finally, Section VII looks
at the number of conflicts over time, showing no visible trend
in the last 500 years.
II. About the data

The data set contains 565 events over the period 1-2015
AD; an excerpt is shown in Table I. Events are generally
armed conflicts^3 such as interstate wars and civil wars, with
a few exceptions represented by violence against citizens
perpetrated by the bloodiest dictatorships, such as Stalin's
and Mao Zedong's regimes. These were included in order
to be consistent with previous works about war victims and
violence, e.g. [?] and [32].

We had to deal with the problem of inconsistency and lack 
of uniformity in the attribution of casualties by historians. 
Some events, such as the siege of Jerusalem, include deaths
from famine, while for other wars only direct military victims
are counted. It might be difficult to disentangle deaths from
direct violence from those arising from such side effects as
contagious diseases, hunger, rise in crime, etc. Nevertheless,
as we show in our robustness analysis, these inconsistencies
do not affect our results, in contrast with analyses done on
thin-tailed data (or data perceived to be so) where conclusions
can be reversed on the basis of a few observations.

The different sources for the data are: [?], [?], [?], [?],
[?], [44], [45] and [46]. For websites like Necrometrics
[?], data have been double-checked against the cited
references. The first observation in our data set is Boudicca's
Revolt of 60-61 AD, while the last one is the international
armed conflict against the Islamic State of Iraq and the Levant,
still open.

We include all events with more than 3000 casualties
(soldiers and civilians) in absolute terms, without any rescaling
with respect to the coeval world population. More details about
rescaling are given in Subsection II-D. The choice of the 3000 victims
threshold was motivated by three main observations:

• Conflicts with a high number of victims are more likely to
be registered and studied by historians. While it is easier
today to have reliable numbers about minor conflicts
(although this point has been challenged by [?] and [?]),
it is highly improbable to ferret out all smaller events
that took place in the fifth century AD. In particular,
a historiographical bias prevents us from accounting for
many of the conflicts that took place in the Americas and
Australia before their discovery by European conquerors.

• A higher threshold gives us a higher confidence about 
the estimated number of casualties, thanks to the larger 
number of sources, even if, for the bloodiest events 
of the far past, we must be careful about the possible 
exaggeration of numbers. 

• The object of our concern is tail risk. The extreme value
techniques we use to study the right tail of the distribution
of war casualties require the imposition of thresholds, actually
even larger than 3000 casualties.

To rescale conflicts and express casualties in terms of
today's world population (more details in Subsection II-D),

^3 We refer to the definition of [?], according to which "An armed conflict
is a contested incompatibility which concerns government and/or territory
where the use of armed force between two parties, of which at least one is
the government of a state, results in at least 25 battle-related deaths". This
definition is also compatible with the Geneva Conventions of 1949.


we relied on the population estimates of [22] for the period
1-1599 AD, and of [?] and [?] for the period 1650-2015
AD. We used:

• Century estimates from 1 AD to 1599. 

• Half-century estimates from 1600 to 1899. 

• Decennial estimates from 1900 to 1949. 

• Yearly estimates from 1950 until today. 


A. Data problems 


Accounts of war casualties are often anecdotal, spreading 
via citations, and based on vague estimates, without anyone’s 
ability to verify the assessments using period sources. For 
instance, the independence war of Algeria has various
estimates, some from the French Army, others from the rebels,
and nothing scientifically obtained [?].

This can lead to several estimates for the same conflict,
some more conservative and some less. Table I shows
different estimates for most conflicts. In case of several estimates,
we present the minimum, the average and the maximum one.
Interestingly, as we show later, choosing one of the three as
the "true" estimate does not affect our results, thanks to the
scaling properties of power laws.

Conflicts such as the Mongolian Invasions, which we refer
to as "named" conflicts, need to be treated with care from a
statistical point of view. Named conflicts are in fact artificial tags
created by historians to aggregate events that share important
historical, geographical and political characteristics, but that
may have never really existed as a single event. Under the
portmanteau Mongolian Invasions (or Conquests), historians
collect all conflicts related to the expansion of the Mongol
empire during the thirteenth and fourteenth centuries. Another
example is the so-called Hundred Years' War in the period
1337-1453. Aggregating all these events necessarily leads to
the creation of very large fictitious conflicts accounting for
hundreds of thousands or millions of casualties. The fact that, for
historical and historiographical reasons, these events tend to
be more present in antiquity and the Middle Ages could lead
to a naive overestimation of the severity of wars in the past.
Notice that named conflicts like the Mongolian Invasions are
different from those like WW1 or WW2, which naturally also
involved several tens of battles in very different locations, but
which took place in a much shorter time period, with no major
time separation among conflicts.

A straightforward technique could be to set a cutoff point
of 25 years for any single event: an 80-year event would
be divided into three minor ones. However, the
length of the window remains arbitrary. Why 25 years? Why
not 17?


As we show in Section VI, a solution to all these problems
with data is to consider each single observation as an imprecise
estimate, a fuzzy number in the definition of [43]. Using Monte
Carlo methods we have shown that, if we assume that the
real number of casualties in a conflict is uniformly distributed
between the minimum and the maximum in the available
data, the tail exponent $\xi$ is not really affected (apart from the
TABLE I: Excerpt of the data set of war casualties. The original data set contains 565 events in the period 1-2015 AD. For
some events more than one estimate is available for casualties and we provide: the minimum (Min), the maximum (Max), and
an intermediate one (Mid) according to historical sources. Casualties and world population estimates (Pop) in 10000.

Event               Start   End    Min    Mid    Max     Pop
Boudicca's Revolt      60    61     70   7.52   8.04   19506
Three Kingdoms        220   280   3600   3800   4032   20231
An Lushan's Reb.      755   763    800   1300   3600   24049
Sicilian Vespers     1282  1302      -   0.41   0.80   39240
WW1                  1914  1918   1466   1544   1841  177718
WW2                  1939  1945   4823   7300   8500  230735



obvious differences in the smaller decimals).^4 Similarly, our
results remain invariant if we remove/add a proportion of the 
observations by bootstrapping our sample and generating new 
data sets. 
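The resampling idea just described can be sketched as follows. This is a minimal illustration under loud assumptions: the data are synthetic (a Pareto sample with arbitrary +/-30% bounds standing in for the Min and Max columns), and the tail index is estimated with the simple Hill estimator rather than the method used in this paper. The names `hill_xi` and `resampled_xi` are ours.

```python
import numpy as np

def hill_xi(sample, k):
    """Hill estimate of the tail index xi from the k largest observations."""
    x = np.sort(np.asarray(sample, dtype=float))[::-1]   # descending order
    return float(np.mean(np.log(x[:k]) - np.log(x[k])))

def resampled_xi(lo, hi, k, n_sim=200, seed=0):
    """Redraw every event uniformly in [lo_i, hi_i]; re-estimate xi each time."""
    rng = np.random.default_rng(seed)
    return np.array([hill_xi(rng.uniform(lo, hi), k) for _ in range(n_sim)])

# Synthetic stand-in for the data set (assumption, not the real data):
# 565 Pareto-distributed "mid" estimates with true xi = 1.
rng = np.random.default_rng(1)
mid = 3000.0 * (1.0 + rng.pareto(1.0, size=565))
lo, hi = 0.7 * mid, 1.3 * mid          # hypothetical min and max estimates

xis = resampled_xi(lo, hi, k=100)
print(f"xi from mid values: {hill_xi(mid, 100):.3f}")
print(f"xi over resamples : mean {xis.mean():.3f}, std {xis.std():.3f}")
```

In this toy setting the spread of the resampled estimates is small compared with the estimate itself, which is the kind of stability the paragraph above refers to.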

We believe that this approach to data as imprecise
observations is one of the novelties of our work, which makes our
conclusions more robust to scrutiny about the quality of data.
We refer to Section VI for more details.


B. Missing events 

We are confident that there are many conflicts that are
not part of our sample. For example, we miss all conflicts
among native populations in the American continent before
its European discovery, as no source of information is actually
available. Similarly we may miss some conflicts of antiquity
in Europe, or in China in the sixteenth century. However, we
can assume that the great part of these conflicts is not in the
very tail of the distribution of casualties, say in the top 10 or
20%. It is in fact not really plausible to assume that historians
have not reported a conflict with 1 or 2 million casualties, or
that such an event is not present in [25] and [29], at least for
what concerns Europe, Asia and Africa.

Nevertheless, in Section VI we deal with the problem
of missing data in more detail.


C. The conflict generator process 

We are not assuming the existence of a unique conflict 
generator process. It is clear that all conflicts of humanity 
do not share the same set of causes. Conflicts belonging to 
different centuries and continents are likely to be not only 
independent, as already underlined in [?] and [?], but also
to have different origins. We thus avoid performing time 
series analysis beyond the straightforward investigation of the 
existence of trends. While it could make sense to subject the 
data to specific tests, such as whether an increase/decrease in 
war casualties is caused by the lethality of weapons, it would 
be unrigorous and unjustified to look for an autoregressive 
component. How could the An Lushan rebellion in China (755 
AD) depend on the Siege of Constantinople by Arabs (717 
AD), or affect the Viking Raids in Ireland (from 795 AD on)? 
But this does not mean that all conflicts are independent: during
WW2, the attack on Pearl Harbor and the Battle of France

^4 Our results do not depend on the choice of the uniform distribution, selected
for simplification. All other bounded distributions produce the same results
in the limit.


were not independent, despite the time and spatial separation,
and that is why historians merge them into one single event.
And while we can accept that most of the causes of WW2 are
related to WW1, when studying numerical casualties, we avoid
translating the dependence into the magnitude of the events: it
would be naive to believe that the number of victims in 1944
depended on the death toll in 1917. How could the magnitude
of WW2 depend on WW1?

Critically, when related conflicts are aggregated, as in the
case of WW2 or the Hundred Years' War (1337-1453),
our data show that the number of conflicts over time (if
we focus on very destructive events in the last 500-600
years and believe in the existence of a conflict generator
process) is likely to follow a homogeneous Poisson process,
as already observed by [?] and [?], thus supporting the idea
that wars are randomly distributed accidents over time, not
following any particular trend. We refer to Section VII for more
details.


D. Data transformations 

We used three different types of data in our analyses: raw 
data, rescaled data and dual data. 

1) Raw data: Presented as collected from the different
sources, and shown in Table I. Let $X_t$ be the number of
casualties in a given conflict at time t, and define the triplet
$\{X_t, X_t^l, X_t^u\}$, where $X_t^l$ and $X_t^u$ represent the lower and
upper bound, if available.

2) Rescaled data: The data is rescaled with respect to
the current world population (7.2 billion^5). Define the triplet
$\{Y_t = X_t \frac{P_{2015}}{P_t},\ Y_t^l = X_t^l \frac{P_{2015}}{P_t},\ Y_t^u = X_t^u \frac{P_{2015}}{P_t}\}$, where
$P_{2015}$ is the world population in 2015 and $P_t$ is the population
at time t = 1, ..., 2015.^6 Rescaled data was used by Steven
Pinker [?] to account for the relative impact of a conflict.
This rescaling tends to inflate past conflicts, and can lead
to the naive statement that violence, if defined in terms of
casualties, has declined over time. But we agree with those
scholars like Robert Epstein [?] stating "[...] why should we
be content with only a relative decrease? By this logic, when
we reach a world population of nine billion in 2050, Pinker
will conceivably be satisfied if a mere two million people are
killed in war that year". In this paper we used rescaled data


^5 According to the United Nations Department of Economic and Social
Affairs [40].

^6 If, for a given year, say 788 AD, we do not have a population estimate,
we use the closest one.
for comparability reasons, and to show that even with rescaled
data we cannot really state statistically that war casualties have 
declined over time (and thus violence). 
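As a worked illustration of the rescaling, the sketch below applies $Y_t = X_t P_{2015} / P_t$ to the WW2 row of Table I and shows the "closest estimate" rule of the footnote. The helper names `rescale` and `closest_population` are ours, and the small population dictionary holds hypothetical placeholder numbers used only to exercise the lookup.

```python
def rescale(casualties, pop_then, pop_2015=7.2e9):
    """Express casualties in terms of the 2015 world population."""
    return casualties * pop_2015 / pop_then

def closest_population(year, estimates):
    """Return the estimate whose year is closest to `year`.

    Ties go to the earlier year (an arbitrary choice for this sketch).
    """
    nearest = min(sorted(estimates), key=lambda y: abs(y - year))
    return estimates[nearest]

# WW2 row of Table I, converted from units of 10,000.
ww2_mid = 7300 * 1e4        # intermediate casualty estimate
ww2_pop = 230735 * 1e4      # world population estimate used for WW2
ww2_rescaled = rescale(ww2_mid, ww2_pop)
print(f"WW2 rescaled to 2015 population: {ww2_rescaled:.3e}")

# Hypothetical century estimates, for illustrating the lookup only.
toy_estimates = {700: 2.1e8, 800: 2.2e8}
pop_788 = closest_population(788, toy_estimates)   # picks the 800 AD value
```

Under these Table I figures, the 73 million WW2 casualties correspond to roughly 228 million when expressed relative to a 7.2 billion world population.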

3) Dual data: those we obtain using the log-rescaling of
equation (1), the triplet $\{Z_t = \varphi(Y_t),\ Z_t^l = \varphi(Y_t^l),\ Z_t^u = \varphi(Y_t^u)\}$.
The inputs for the function $\varphi(\cdot)$ are the rescaled values
$Y_t$.

Removing the upper bound allows us to combine practical
convenience with theoretical rigor. Since we want to apply the
tools of extreme value theory, the finiteness of the upper bound
is fundamental to decide whether a heavy-tailed distribution
falls in the Weibull (finite upper bound) or in the Fréchet
class (infinite upper bound). Given their finite upper bound,
war casualties are necessarily in the Weibull class. From a
theoretical point of view the difference is large [?], especially
for the existence of moments. But from a practical point of
view, heavy-tailed phenomena are better modeled within the
Fréchet class. Power laws belong to this class. And as observed
by [?], when a fat-tailed random variable Y has a very large
finite upper bound H, unlikely to be ever touched (and thus
observed in the data), its tail, which from a theoretical point
of view falls in the so-called Weibull class, can be modeled
as if belonging to the Fréchet one. In other words, from a
practical point of view, when H is very large, it is impossible
to distinguish the two classes; and in most cases this is
indeed not a problem.

Problems arise when the tail looks so thick that even the 
first moment appears to be infinite, when we know that this 
is not possible. And this is exactly the case with our data, as
we show in the next Section.



Fig. 2: Graphical representation (Log-Log Plot) of what may 
happen if one ignores the existence of the finite upper bound 
H, since only M is observed. 

Figure 2 illustrates the need to separate real and dual
data. For a random variable Y with remote upper bound H,
the real tail is represented by the continuous line. However, if
we only observe values up to M, and we ignore the existence
of H, which is unlikely to be reached and therefore hardly
observable, we could be inclined to believe that the tail is
the dotted one, the apparent one. The two tails are indeed
essentially indistinguishable for most cases, but the divergence
is evident when we approach H. Hence our transformation.
Notice that $\varphi(y) \approx y$ for very large values of H. This means
that for a very large upper bound, unlikely to be reached, such as
the world population, the results we get for the tail of Y and
$Z = \varphi(Y)$ are essentially the same most of the time. This is
exactly what happens with our data, considering that no armed
conflict has ever killed more than 19% of the world population
(the Three Kingdoms, 184-280 AD). But while Y is bounded,
Z is not.
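The approximation $\varphi(y) \approx y$ for large H follows from a first-order expansion of the logarithm in equation (1). The short derivation below is our own worked step, valid for fixed y and L with $(y - L)/(H - L)$ small:

```latex
\begin{align*}
\varphi(y) &= L - H \log\left(\frac{H-y}{H-L}\right)
            = L - H \log\left(1 - \frac{y-L}{H-L}\right) \\
           &= L + H\left[\frac{y-L}{H-L}
              + \frac{1}{2}\left(\frac{y-L}{H-L}\right)^{2} + \dots\right] \\
           &= L + \frac{H}{H-L}\,(y-L) + O\!\left(\frac{(y-L)^{2}}{H}\right)
            \;\longrightarrow\; y \quad \text{as } H \to \infty .
\end{align*}
```

The correction terms are positive, so the dual value always lies slightly above the real one, and the gap only becomes material when y is a non-negligible fraction of H.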

Therefore we can safely model Z as belonging to the
Fréchet class, study its tail behavior, and then come back to
Y when we are interested in the first moments, which under
Z may not exist.


III. Descriptive Data Analysis

Figure 1, showing casualties over time, is composed of four
subfigures, two related to raw data and two related to dual
data.^7 For each type of data, using the radius of the different
bubbles, we show the relative size of each event in terms of
victims with respect to the world population and within our
data set (what we call Impact). Note that the choice of the
type of data (or of the definition of size) may lead to different
interpretations of trends and patterns. From rescaled and dual
data, one could be led to superficially infer a decrease in the
number of casualties over time.

As to the number of armed conflicts, Figure 1 seems to
suggest an increase over time, as we see that most events are
concentrated in the last 500 years or so: an illusory
increase, most likely due to a reporting bias for the conflicts
of antiquity and the early Middle Ages.

Figure 3 shows the histogram of war casualties, when
dealing with raw data. Similar results are obtained using
rescaled and dual casualties. The graph suggests the presence
of a long right tail in the distribution of victims, and the
maximum is represented by WW2, with an estimated amount
of about 73 million victims.

The presence of a Paretian tail is also supported by the
quantile-quantile plot in Figure 4, where we use the
exponential distribution as a benchmark. The concave behavior we
can observe in the plot is considered a good signal of a fat-tailed
distribution [?], [?].

In order to investigate the presence of a right fat tail, we
can make use of another graphical tool: the meplot, that is, the
mean excess function plot.

Let X be a random variable with distribution F and right
endpoint $x_F$ (i.e. $x_F = \sup\{x \in \mathbb{R} : F(x) < 1\}$). The
function

$e(u) = E[X - u \mid X > u] = \frac{\int_u^\infty (t - u)\,dF(t)}{\int_u^\infty dF(t)}, \quad 0 < u < x_F,$    (2)

is called mean excess function of X (mef). The empirical mef
of a sample $X_1, X_2, \ldots, X_n$ is easily computed as

$e_n(u) = \frac{\sum_{i=1}^n (X_i - u)\,\mathbf{1}_{\{X_i > u\}}}{\sum_{i=1}^n \mathbf{1}_{\{X_i > u\}}},$    (3)



^7 Given Equation (1), rescaled and dual data are approximately the same,
hence there is no need to show further pictures.
[Figure 3: Histogram of casualties]

Fig. 3: Histogram of war casualties, using raw data. A long
fat right tail is clearly visible.


that is, the sum of the exceedances over the threshold u divided
by the number of such data points. Interestingly, the mean
excess function is a way of characterizing distributions within
the class of continuous distributions [?]. For example, power
law distributions are characterized by van der Wijk's law
[?], that is, by a mean excess function linearly increasing in
the threshold u, while an exponential distribution with mean
(scale parameter) $\lambda$ would show a constant mean excess
function with value $\lambda$.

Figure 5 shows the meplot of war casualties, again using
raw data. The graph is easily obtained by plotting the pairs
$\{(X_{i:n}, e_n(X_{i:n})) : i = 1, \ldots, n\}$, where $X_{i:n}$ is the i-th order
statistic, used as a threshold. The upward trend in Figure 5 is
a further signal of fat-tailed data, as discussed for example in
[?] and [?].
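The empirical mean excess function of equation (3) and the pairs used for the meplot can be computed with a few lines of NumPy. This is a minimal sketch; the function names are ours, and the largest observation is skipped as a threshold because no data point exceeds it.

```python
import numpy as np

def mean_excess(sample, u):
    """Empirical mean excess e_n(u): average exceedance over threshold u."""
    x = np.asarray(sample, dtype=float)
    exceed = x[x > u]
    if exceed.size == 0:
        raise ValueError("no observation exceeds the threshold u")
    return float(np.mean(exceed - u))

def meplot_points(sample):
    """Pairs (X_{i:n}, e_n(X_{i:n})) for every order statistic below the maximum."""
    x = np.sort(np.asarray(sample, dtype=float))
    return [(float(u), mean_excess(x, u)) for u in x if u < x[-1]]

# Tiny toy sample, used only to show the mechanics.
toy = [1, 2, 3, 4]
print(mean_excess(toy, 2))    # (1 + 2) / 2 = 1.5
print(meplot_points(toy))
```

Plotting the second coordinate against the first gives the meplot: a roughly linear upward trend points to a power-law tail, a flat line to an exponential one.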



Fig. 4: Exponential qq-plot of war casualties, using raw data.
The clear concave behavior suggests the presence of a right fat
tail.

A characteristic of the power law class is the non-existence 
(infiniteness) of moments, at least the higher-order ones, with 
important consequences in terms of statistical inference. If the 


Fig. 5: Mean Excess Function Plot (meplot) for war casualties,
using raw data (horizontal axis: Threshold). A steep upward trend
is visible, suggesting the presence of a Paretian right tail.


variance is not finite, as is often the case with financial data, it
becomes more problematic to build confidence intervals for the
mean. Similarly, if the third moment (skewness) is not defined,
it is risky to build confidence intervals for the variance.

An interesting graphical tool showing the behavior of
moments in a given sample is the Maximum-to-Sum plot, or
MS Plot. The MS Plot relies on a simple consequence of the
law of large numbers [?]. For a sequence $X_1, X_2, \ldots, X_n$
of nonnegative i.i.d. random variables, if for p = 1, 2, 3, ...,
$E[X^p] < \infty$, then $R_n^p = M_n^p / S_n^p \to 0$ as $n \to \infty$,
where $S_n^p = \sum_{i=1}^n X_i^p$ is the partial sum, and
$M_n^p = \max(X_1^p, \ldots, X_n^p)$ the partial maximum.
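The ratio $R_n^p$ is straightforward to compute as a running maximum divided by a running sum. The sketch below is illustrative only; the function name `ms_ratio` is ours, and it assumes strictly positive observations so that the first partial sum is nonzero.

```python
import numpy as np

def ms_ratio(sample, p):
    """Running maximum-to-sum ratio R_n^p = M_n^p / S_n^p for n = 1, ..., N."""
    xp = np.asarray(sample, dtype=float) ** p
    return np.maximum.accumulate(xp) / np.cumsum(xp)

# Tiny toy sample to show the mechanics: for p = 1 the ratios are
# 1/1, 2/3 and 3/6.
print(ms_ratio([1, 2, 3], p=1))

# Simulated Pareto sample with tail exponent 1.5 (synthetic data):
# the p = 1 ratio should drift towards 0, the higher ones should not.
rng = np.random.default_rng(0)
pareto_sample = 1.0 + rng.pareto(1.5, size=10_000)
for p in (1, 2, 3, 4):
    print(p, ms_ratio(pareto_sample, p)[-1])
```

Plotting `ms_ratio(sample, p)` against n for p = 1, 2, 3, 4 reproduces the layout of an MS Plot.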


[Figure 6 panels: MSplot for p = 1; MSplot for p = 2]

Fig. 6: Maximum to Sum plot to verify the existence of the
first four moments of the distribution of war casualties, using
raw data. No clear convergence to zero is observed for p = 2, 3, 4,
suggesting that the moments of order 2 or higher may not be
finite.

Figures 6 and 7 show the MS Plots of war casualties for
raw and rescaled data respectively. No finite moment appears
to exist, no matter how much data is used: it is in fact clear
that the ratio does not converge to 0 for p = 1, 2, 3, 4, thus
suggesting that the distribution of war casualties has such a
fat right tail that not even the first moment exists. This is
particularly evident for rescaled data.

[Figure 7 panels: MSplot for p = 1; MSplot for p = 2; MSplot for p = 4]

Fig. 7: Maximum to Sum plot to verify the existence of
the first four moments of the distribution of war casualties,
using rescaled data. No clear convergence to zero is observed,
suggesting that all the first four moments may not be finite.

To compare to thin tailed situations, Figure 8 shows the MS Plot of
a Pareto ($\alpha = 1.5$): notice that the first moment exists, while
the other moments are not defined, as the shape parameter
equals 1.5. For thin tailed distributions such as the Normal,
the MS Plot rapidly converges to 0 for all p.



Fig. 8: Maximum to Sum plot of a Pareto(1.5): as expected,
the first moment is finite ($R_n^1 \to 0$), while higher moments do
not exist ($R_n^p$ is erratic and bounded away from 0 for p = 2, 3, 4).


TABLE II: Average inter-arrival times (in years) and their
mean absolute deviation for events with more than 0.5, 1, 2, 5,
10, 20 and 50 million casualties, using raw (Ra) and rescaled
data (Re).

Thresh.   Avg Ra   MAD Ra   Avg Re   MAD Re
0.5        23.71    35.20     9.63    15.91
1          34.08    47.73    13.28    20.53
2          56.44    72.84    20.20    28.65
5          93.03   113.57    34.88    46.85
10        133.08   136.88    52.23    63.91
20        247.14   261.31    73.12    86.19
50        345.50   325.50   103.97   114.25



compute the distance, in terms of years, between two time- 
contiguous conhicts, and use for measure of dispersion the 
mean absolute deviation (from the mean). Table [II] shows the 
average inter-arrival times between armed conhicts with at 
least 0.5, 1, 2, 5, 10, 20 and 50 million casualties. For a conhict 
of at least 500k casualties, we need to wait on average 23.71 
years using raw data, and 9.63 years using rescaled or dual 
data (which, as we saw, tend to inhate the events of antiquity). 
For conhicts with at least 5 million casualties, the time delay 
is on average 93.03 or 34.88, depending on rescaling. Clearly, 
the bloodier the conhict, the longer the inter-arrival time. The 
results essentially do not change if we use the mid or the 
ending year of armed conhicts. 

The consequence of this analysis is that the absence of a 
conhict generating more than, say, 5 million casualties in the 
last sixty years highly insufficient to state that their probability 
has decreased over time, given that the average inter-arrival 
time is 93.03 years, with a mean absolute deviation of 113.57 
years! Unfortunately, we need to wait for more time to assert 
whether we are really living in a more peaceful era,: present 
data are not in favor (nor against) a structural change in 
violence, when we deal with war casualties. 

Section I asserted that our data set does not form a proper time
series, in the sense that no real time dependence is present
(apart from minor local exceptions). Such consideration is
further supported by the so-called record plot in Figure 9.
This graph is used to check the i.i.d. nature of data, and
it relies on the fact that, if data were i.i.d., then records
over time should follow a logarithmic pattern [?]. Given a
sequence $X_1, X_2, \ldots$, and having defined the partial maximum
$M_n = \max(X_1, \ldots, X_n)$, an observation $x_n$ is called a
record if $X_n = M_n = x_n$.
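As a rough companion to the record plot, the sketch below counts records in a sequence and compares the count with the expectation under the i.i.d. hypothesis. It relies on a standard result not stated in the text: for $n$ i.i.d. continuous observations the expected number of records is the harmonic number $1 + 1/2 + \cdots + 1/n$, which grows like $\log n$. The function names are hypothetical.

```python
import math

def count_records(xs):
    """Count observations that set a new running maximum when they arrive."""
    records, running_max = 0, float("-inf")
    for x in xs:
        if x > running_max:  # strict inequality: ties are not new records
            records += 1
            running_max = x
    return records

def expected_records_iid(n):
    """Expected number of records among n i.i.d. continuous observations."""
    return sum(1.0 / i for i in range(1, n + 1))

# For n = 565 (our sample size) the expectation is close to log(n) + 0.5772.
approx = math.log(565) + 0.5772
```

An observed record count far above this logarithmic benchmark would hint at a trend, while a count close to it is compatible with i.i.d. data.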

For the sake of completeness, if we focus our attention on
shorter periods, like the 50 years following WW2, a reduction
in the number and the size of conflicts can probably be
observed (the so-called Long Peace of [?] and [?]),
but recent studies suggest that this trend could already have
changed in the last years [?]. However, in our view, this
type of analysis is not meaningful, once we account for the

Table II provides information about armed conflicts and
their occurrence over time. Note that we have chosen very
large thresholds to minimize the risk of under-reporting. We


Even if redundant, given the nature of our data and the descriptive analysis
performed so far, the absence of a relevant time dependence can be checked
by performing a time series analysis of war casualties. No significant trend,
seasonality or temporal dependence can be observed over the entire time window.
[Figure: record plot. Title: Plot of Record Development; x-axis: Trials.]

Fig. 9: Record plot for testing the plausibility of the i.i.d.
hypothesis, which seems to be supported.


extreme nature of armed conflicts, and the long inter-arrival
times between them, as shown in Section III.


IV. Tail risk of armed conflicts 

Our data analysis in Section III, suggesting a heavy right
tail for the distribution of war casualties, both for raw and
rescaled data, is consistent with the existing literature, e.g.
[?] and their references. Using extreme value theory,
or EVT [?], we use the Generalized
Pareto Distribution (GPD). The choice is due to the Pickands,
Balkema and de Haan's Theorem [?].

Consider a random variable $X$ with distribution function $F$,
and call $F_u$ the conditional df of $X$ above a given threshold
$u$. We can then define the r.v. $W$, representing the rescaled
excesses of $X$ over the threshold $u$, so that

$$F_u(w) = P(X - u \le w \mid X > u) = \frac{F(u + w) - F(u)}{1 - F(u)},$$

for $0 \le w \le x_F - u$, where $x_F$ is the right endpoint of the
underlying distribution $F$.

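Pickands, Balkema and de Haan [?] have shown that
for a large class of underlying distribution functions $F$, and
a large $u$, $F_u$ can be approximated by a Generalized Pareto
distribution (hence GPD), i.e. $F_u(w) \to G(w)$, as $u \to \infty$,
where

$$G(w) = \begin{cases} 1 - \left(1 + \xi \frac{w}{\sigma}\right)^{-1/\xi} & \text{if } \xi \neq 0 \\ 1 - e^{-w/\sigma} & \text{if } \xi = 0 \end{cases}, \quad w \ge 0. \tag{4}$$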


The parameter $\xi$, known as the shape parameter, is the central
piece of information for our analysis: $\xi$ governs the fatness of
the tails, and thus the existence of moments. The moment of
order $p$ of a Generalized Pareto distributed random variable
exists if and only if $\xi < 1/p$ [?].
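A minimal sketch of Equation (4) and of the moment condition follows, to make the role of $\xi$ concrete. The function names `gpd_cdf` and `moment_exists` are ours and purely illustrative.

```python
import math

def gpd_cdf(w, xi, sigma):
    """GPD distribution function G(w) of Equation (4), for sigma > 0."""
    if w < 0:
        return 0.0
    if xi == 0:
        return 1.0 - math.exp(-w / sigma)
    base = 1.0 + xi * w / sigma
    if base <= 0:  # beyond the finite right endpoint, possible only when xi < 0
        return 1.0
    return 1.0 - base ** (-1.0 / xi)

def moment_exists(p, xi):
    """The moment of order p is finite if and only if xi < 1/p."""
    return xi < 1.0 / p
```

For instance, `moment_exists(1, 1.5)` is `False`: with a shape parameter around 1.5, not even the mean of the fitted GPD is finite.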

The Pickands, Balkema and de Haan's Theorem thus allows
us to focus on the right tail of the distribution without caring
too much about what happens below the threshold $u$. One
powerful property of the generalized Pareto is the tail stability
with respect to threshold [?]. Formally, if $W \sim GPD(\xi, \sigma_1)$
for $W \ge u_1$, then $W \sim GPD(\xi, \sigma_2)$ for $W \ge u_2 > u_1$. In
other words: increasing the threshold does not affect the shape
parameter. What changes is only the scale parameter.
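The stability property can be checked numerically. The sketch below uses the standard relation $\sigma_2 = \sigma_1 + \xi (u_2 - u_1)$, which is not spelled out in the text and is taken here as an assumption from the general theory of the GPD; all numerical values are illustrative.

```python
def gpd_survival(w, xi, sigma):
    """Survival function 1 - G(w) of a GPD with xi > 0."""
    return (1.0 + xi * w / sigma) ** (-1.0 / xi)

def rescaled_sigma(xi, sigma1, u1, u2):
    """Scale above u2 implied by a GPD(xi, sigma1) fit above u1 < u2."""
    return sigma1 + xi * (u2 - u1)

# Illustrative values only.
xi, sigma1, u1, u2 = 1.5, 90_000.0, 25_000.0, 100_000.0
sigma2 = rescaled_sigma(xi, sigma1, u1, u2)
w = 250_000.0  # an excess over the higher threshold u2

# Conditional survival above u2, computed from the fit above u1 ...
lhs = gpd_survival(u2 - u1 + w, xi, sigma1) / gpd_survival(u2 - u1, xi, sigma1)
# ... coincides with a GPD survival with the same shape and the new scale.
rhs = gpd_survival(w, xi, sigma2)
```

The two quantities `lhs` and `rhs` agree up to floating-point error, which is the sense in which only the scale moves with the threshold.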

We start by identifying the threshold $u$ above which the
GPD approximation may hold. Different heuristic tools can
be used for this purpose, from Zipf plots, i.e. log-log plots of
the survival function, to mean excess function plots (as the one
in Figure [?]), where one looks for the threshold above which
it is possible to visualize, if present, the linearity that
is typical of fat-tailed phenomena. Other possibilities like the
Hill plot are discussed for example in [9].
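For readers who want to reproduce a mean excess plot, a bare-bones version of the empirical mean excess function is sketched below. The function names are hypothetical and the plotting step is left out; one would plot the returned pairs and look for an approximately linear, upward-sloping stretch.

```python
def mean_excess(data, u):
    """Empirical mean excess e(u): average of (x - u) over observations x > u."""
    excesses = [x - u for x in data if x > u]
    if not excesses:
        raise ValueError("no observations above the threshold")
    return sum(excesses) / len(excesses)

def mean_excess_curve(data, thresholds):
    """Pairs (u, e(u)) to be plotted against u."""
    return [(u, mean_excess(data, u)) for u in thresholds]
```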

Our investigations suggest that the best threshold for
fitting the GPD is 25,000 casualties, when using raw data,
well above the 3,000 minimum we have set in collecting our
observations. A total of 331 armed conflicts lie above this
threshold (58.6% of all our data).

The use of rescaled and dual data requires us to rescale the
threshold; the value thus becomes 145k casualties. It is worth
noticing that, nevertheless, for rescaled data the threshold
could be lowered to 50k, and still we would have a satisfactory
GPD fitting.

If $\xi > -1/2$, the parameters of a GPD can be easily
estimated using Maximum Likelihood Estimation (MLE) [9],
while for $\xi < -1/2$ MLE estimates are not consistent and
other methods are better used [?].

In our case, given the fat-tailed behavior, $\xi$ is likely to
be positive. This is clearly visible in Figure 10 showing the
Pickands plot, based on the Pickands' estimator for $\xi$, defined
as

$$\hat{\xi}^{(P)}_{k,n} = \frac{1}{\log 2} \log \frac{X_{k,n} - X_{2k,n}}{X_{2k,n} - X_{4k,n}},$$

where $X_{k,n}$ is the $k$-th upper order statistic out of a sample
of $n$ observations.
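The estimator translates directly into code. The sketch below is a plain implementation of the formula above; the function name `pickands_estimator` is ours, and a Pickands plot is obtained by evaluating it over a range of $k$.

```python
import math

def pickands_estimator(data, k):
    """Pickands estimate of xi from the k-th, 2k-th and 4k-th upper order statistics."""
    n = len(data)
    if k < 1 or 4 * k > n:
        raise ValueError("need 1 <= k <= n / 4")
    desc = sorted(data, reverse=True)  # desc[j - 1] is the j-th largest value
    x_k, x_2k, x_4k = desc[k - 1], desc[2 * k - 1], desc[4 * k - 1]
    return math.log((x_k - x_2k) / (x_2k - x_4k)) / math.log(2.0)
```

As a sanity check, if the $j$-th largest value is exactly proportional to $j^{-\xi}$, the ratio inside the logarithm equals $2^{\xi}$ and the estimator returns $\xi$ for every admissible $k$.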



Fig. 10: Pickands' estimator for the $\xi$ shape parameter for
war casualties (raw data). The value 1.5 appears to be a good
educated guess.

In the Pickands plot, $\hat{\xi}^{(P)}_{k,n}$ is computed for different values
of $k$, and the "optimal" estimate of $\xi$ is obtained by looking
TABLE III: Maximum likelihood estimates (and standard
errors) of the Generalized Pareto Distribution parameters for
casualties over different thresholds, using raw, rescaled and
dual data. We also provide the number of events lying above
the thresholds, the total number of observations being equal
to 565.

Data             Threshold    $\xi$               $\sigma$
Raw Data         25k          1.4985 (0.1233)     90620 (2016)
Rescaled Data    145k         1.5868 (0.1265)     497436 (2097)
Dual Data        145k         1.5915 (0.1268)     496668 (1483)


at a more or less stable value for $\xi$ in the plot. In Figure
10, the value 1.5 seems to be a good educated guess for raw
data, and similar results hold for rescaled and dual amounts.
In any case, the important message is that $\xi > 0$, therefore we
can safely use MLE.

Table III presents our estimates of $\xi$ and $\sigma$ for raw, rescaled
and dual data. In all cases, $\xi$ is significantly greater than 1, and
around 1.5 as we guessed by looking at Figure 10. This means
that in all cases the mean of the distribution of casualties
seems to be infinite, consistently with the descriptive analysis
of Section III.

Further, Table III shows the similarity between rescaled and
dual data. Looking at the standard errors, no test would reject
the null hypothesis that $\xi_{rescaled} = \xi_{dual}$, or that
$\sigma_{rescaled} = \sigma_{dual}$.

Figure 11 shows the goodness of our GPD fit for dual
data, which is also supported by goodness-of-fit tests for the
Generalized Pareto Distribution [42] (p-value: 0.37). As usual,
similar results do hold for raw and rescaled data. Figure 12
shows the residuals of the GPD fit which are, as can be
expected (see [?]), exponentially distributed.


V. Estimating the shadow mean 

Given the finite upper bound $H$, we know that $E[Y]$ must
be finite as well. A simple idea is to estimate it by computing
the sample mean, or the conditional mean above a given
threshold, if we are more interested in the tail behavior of
war casualties, as in this paper. For a minimum threshold $L$
equal to 145k, this value is $1.77 \times 10^7$ (remember that $Y$
represents the rescaled data).

However, in spite of its boundedness, the support of $Y$ is so
large that $Y$ behaves like a very heavy-tailed distribution, at
least until we approach the upper bound $H$. This makes
the (conditional) sample mean not really reliable, or robust:
one single extreme observation can make it jump, given that
the (conditional) sample mean has a breakdown point equal to
0 [?].

A more robust way of computing what we define the
conditional shadow mean of $Y$ (the true conditional mean not

Notice that the sample mean above a minimum threshold $L$ corresponds
to the concept of expected shortfall used in risk management.



(a) Distribution of the excesses (ecdf and theoretical).

(b) Right tail (empirical and theoretical).

[Axis label: x (on log scale).]

Fig. 11: GPD fitting of war casualties exceeding the 145k
threshold, using dual data.


visible from data) is therefore to use the log-transformation of
Equation [?].

Let $F$ and $G$ be respectively the distributions of $Y$ and $Z$.
With $f$ and $g$, we indicate the densities.

In the previous section we have seen that, when using the
dual version of rescaled data, $G \approx GPD(\xi, \sigma)$ for $Z \ge L =
145000$ (even 50k gives good results, but 145k is comparable
to the 25k threshold of raw data). Moreover we know that
$Z = \varphi(Y)$, so that $Y = \varphi^{-1}(Z) = (L - H) e^{\frac{L - Z}{H}} + H$.

This implies that the distribution of $Y$, for $Y \ge L$, can be
obtained from $G$.
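The pair of maps can be written down explicitly. In the sketch below, `phi_inv` is the expression for $\varphi^{-1}$ given above, while `phi` is obtained by solving that expression for $Z$; this inversion is ours and should be read as an assumption rather than a quotation. The function names and default arguments are illustrative.

```python
import math

H = 7.2e9      # upper bound used in the text
L = 145_000.0  # minimum threshold for rescaled data

def phi(y, low=L, high=H):
    """Map y in [low, high) to the unbounded dual variable z in [low, inf)."""
    return low - high * math.log((high - y) / (high - low))

def phi_inv(z, low=L, high=H):
    """Inverse map: (low - high) * exp((low - z) / high) + high."""
    return (low - high) * math.exp((low - z) / high) + high
```

Two features are worth noticing: `phi(L)` returns `L`, so the threshold is left unchanged, and for values far below `H` the dual variable is only marginally larger than the original one, which is why rescaled and dual data give nearly identical GPD fits.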

First we have

$$\int_L^{\infty} g(z)\,dz = \int_L^{\varphi^{-1}(\infty)} f(y)\,dy. \tag{5}$$


their ratio for different minimum thresholds $L$, using rescaled
data. As already said, our GPD fit is already satisfactory for
$L = 50000$, which is why we start from that threshold.

The ratios in Table IV show how the sample mean
underestimates the shadow mean, especially for lower thresholds,
for which the shadow mean is almost 3 times larger than the
sample mean. For these reasons, a "journalistic" reliance on
the sample mean would be highly misleading when considering
all conflicts together, without setting large thresholds. There
would be the serious risk of underestimating the real lethality
of armed conflicts.

TABLE IV: Conditional shadow mean, conditional sample 
mean and their ratio for different minimum thresholds. In bold 
the values for the 145k threshold. Rescaled data. 


(a) Scatter plot of residuals and their smoothed average.

(b) QQ-plot of residuals against exponential quantiles.

Fig. 12: Residuals of the GPD fitting: looking for exponentiality.


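And we know that

$$g(z; \xi, \sigma) = \frac{1}{\sigma}\left(1 + \xi\,\frac{z - L}{\sigma}\right)^{-\frac{1}{\xi} - 1}, \quad z \in [L, \infty). \tag{6}$$

This implies, setting $\alpha = \xi^{-1}$ and $k = \sigma/H$,

$$f(y; \alpha, k) = \frac{1}{k\,(H - y)} \left(1 - \frac{\log(H - y) - \log(H - L)}{\alpha k}\right)^{-\alpha - 1}, \quad y \in [L, H]. \tag{7}$$

We can then derive the shadow mean of $Y$ as

$$E[Y] = \int_L^H y\, f(y)\,dy, \tag{8}$$

obtaining

$$E[Y] = L + (H - L)\, e^{\alpha k} (\alpha k)^{\alpha}\, \Gamma(1 - \alpha, \alpha k). \tag{9}$$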
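Equation (9) is easy to evaluate numerically. The sketch below is only an illustration under stated assumptions: the upper incomplete gamma function is computed with a textbook power series for the lower incomplete gamma (adequate here because $\alpha k$ is tiny), the function names are ours, and the example inputs are the dual-data estimates of Table III with $H = 7.2$ billion and $L = 145000$. The output should be of the order of $10^7$; exact agreement with Table IV is not expected, since the precise inputs behind that table are not restated here.

```python
import math

def upper_incomplete_gamma(s, x, terms=60):
    """Gamma(s, x) for s > 0 and small x >= 0, via the series of the lower function."""
    lower = sum((-1) ** n * x ** (s + n) / (math.factorial(n) * (s + n))
                for n in range(terms))
    return math.gamma(s) - lower

def shadow_mean(xi, sigma, low, high):
    """Conditional shadow mean of Equation (9); needs xi > 1 so that 1 - alpha > 0."""
    alpha = 1.0 / xi
    k = sigma / high
    ak = alpha * k
    gamma_term = upper_incomplete_gamma(1.0 - alpha, ak)
    return low + (high - low) * math.exp(ak) * ak ** alpha * gamma_term

# Example inputs: dual-data estimates of Table III, H and L as in the text.
example = shadow_mean(xi=1.5915, sigma=496668.0, low=145000.0, high=7.2e9)
```

Because the result depends on $H$ only through $k = \sigma/H$ and the factor $(H - L)$, rerunning the function with other values of `high` gives a quick feel for the sensitivity to the upper bound.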

Table IV provides the shadow mean, the sample mean and


Thresh. ×10^3    Shadow ×10^7    Sample ×10^7    Ratio
50               3.6790          1.2753          2.88
100              3.6840          1.5171          2.43
145              3.6885          1.7710          2.08
300              4.4089          2.2639          1.95
500              5.6542          2.8776          1.96
1000             6.5511          3.9481          1.85


A similar procedure can be applied to raw data. Fixing $H$
equal to 7.2 billion and $L = 25000$, we can define $V = \varphi(X)$
and study its tail. The tail coefficient is 1.4970. Even in
this case the sample mean underestimates the shadow mean
by a factor of at least 1.6 (up to a maximum of 7.5!).

These results tell us something already underlined by [?]:
we should not underestimate the risk associated with armed
conflicts, especially today. While data do not support any
significant reduction in human belligerence, we now have both
the connectivity and the technologies to annihilate the entire
world population (finally touching that value $H$ we have not
touched so far). In antiquity, it was highly improbable for
a conflict involving the Romans to spread all over the world,
thus also affecting the populations of the Americas. A conflict
in ancient Italy could surely influence the populations in Gaul,
but it could have no impact on people living in Australia at
that time. Now, a conflict in the Middle East could in principle
start the next World War, and no one could feel completely
safe.


It is important to stress that changes in the value of $H$ do not
affect the conclusions we can draw from data, when variations
do not modify the order of magnitude of $H$. In other terms,
the estimates we get for $H = 7.2$ billion are qualitatively
consistent with what we get for all values of $H$ in the range
[6, 9] billion. For example, for $H = 9$ billion, the conditional
shadow mean of rescaled data, for $L = 145000$, becomes
$4.0384 \times 10^7$. This value is surely larger than $3.6885 \times 10^7$, the
one we find in Table IV, but it is still in the vicinity of $4 \times 10^7$,
so that, from a qualitative point of view, our conclusions are
still completely valid.
Other shadow moments

As we did for the mean, we can use the transformed
distribution to compute all moments, given that, since the support
of $Y$ is finite, all moments must be finite. However, because of
the wide range of variation of $Y$, and its heavy-tailed behavior,
one needs to be careful in the interpretation of these moments.
As discussed in [38], the simple standard deviation (or the
variance) becomes unreliable when the support of $Y$ allows
for extremely fat tails (and one should prefer other measures of
variability, like the mean absolute deviation). It goes without
saying that for higher moments it is even worse.

Deriving these moments can be cumbersome, both
analytically and numerically, but it is nevertheless possible. For
example, if we are interested in the standard deviation of $Y$
for $L = 145000$, we get $3.08 \times 10^8$. This value is very large,
three times the sample standard deviation, but finite. Other
values are shown in Table V. In all cases, the shadow standard
deviation is larger than the sample one, as expected when
dealing with fat tails [?]. As we stressed, care must be
used in interpreting these quantities.

Note that our dual approach can be generalized to all
those cases in which an event can manifest very extreme
values, without going to infinity because of some physical or
economic constraint.

TABLE V: Shadow standard deviation, sample standard deviation
and their ratio for different minimum thresholds. Rescaled
data.

Thresh. ×10^3    Shadow ×10^8    Sample ×10^8    Ratio
50               2.4388          0.8574          2.8444
100              2.6934          0.9338          2.8843
145              3.0788          1.0120          3.0422
300              3.3911          1.1363          2.9843
500              3.8848          1.2774          3.0412
1000             4.6639          1.4885          3.1332


VI. Dealing with missing and imprecise data 


(by selecting across the various estimates shown in Table
I): the effect does not go beyond the smaller decimals.
We can conclude that our estimates are remarkably robust to
imprecisions and missing values in the data.






An effective way of checking the robustness of our estimates
to the "quality and reliability" of data is to use resampling
techniques, which allow us to deal with non-precise [?]
and possibly missing data. We have performed three different
experiments:

• Using the jackknife, we have created 100k samples, by
randomly removing up to 10% of the observations lying
above the 25k threshold. In more than 99% of cases
$\xi \gg 1$. As can be expected, the shape parameter $\xi$ goes
below the value 1 only if most of the observations we
remove belong to the upper tail, like WW1 or WW2.

• Using the bootstrap with replacement, we have generated
another set of 100k samples. As visible in Figure 13,
$\xi < 1$ in a tiny minority of cases, less than 0.5%. All other
estimates nicely distribute around the values of Table III.
This is true for raw, rescaled and dual data.

• We tested for the effect of changes in such large events as
WW2 owing to the inconsistencies we mentioned earlier


(b) Rescaled data.

Fig. 13: Distribution of the $\xi$ estimate (thresholds 25k and
145k casualties for raw and rescaled data respectively)
according to 100k bootstrap samples.
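The logic of the bootstrap experiment can be sketched as follows. This is not the procedure used for Figure 13: to keep the example self-contained, the GPD maximum likelihood fit is replaced by the simpler Hill estimator (a swapped-in technique), the data are synthetic Pareto draws with true $\xi = 1.5$, and the number of resamples is far smaller than 100k. All names are hypothetical.

```python
import math
import random

def hill_estimator(sample, k):
    """Hill estimate of xi from the k largest observations (heavy tails, xi > 0)."""
    desc = sorted(sample, reverse=True)
    return sum(math.log(desc[i] / desc[k]) for i in range(k)) / k

def bootstrap_xi(sample, k, n_boot, seed=0):
    """Resample with replacement and re-estimate xi on every resample."""
    rng = random.Random(seed)
    n = len(sample)
    return [hill_estimator(rng.choices(sample, k=n), k) for _ in range(n_boot)]

# Synthetic Pareto data with true xi = 1.5 (a stand-in, not the war data).
rng = random.Random(42)
synthetic = [rng.paretovariate(1.0 / 1.5) for _ in range(500)]
estimates = bootstrap_xi(synthetic, k=100, n_boot=200)
share_below_one = sum(e < 1.0 for e in estimates) / len(estimates)
```

The quantity `share_below_one` plays the role of the "tiny minority of cases" mentioned above: it is the fraction of resamples in which the estimated shape parameter falls below 1.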


VII. Frequency of armed conflicts

Can we say something about the frequency of armed
conflicts over time? Can we observe some trend? In this section,
we show that our data tend to support the findings of [?]
and [32], contra [24], [?], that armed conflicts are likely to
follow a homogeneous Poisson process, especially if we focus
on events with large casualties.

That's why, when dealing with fat tails, one should always prefer $\xi$ to
other quantities like the sample mean [?]. And also use $\xi$ in corrections,
as we propose here.
The good GPD approximation allows us to use a well-known
model of extreme value theory: the Peaks-over-Threshold,
or POT. According to this approach, excesses over
a high threshold follow a Generalized Pareto distribution, as
stated by the Pickands, Balkema and de Haan's Theorem
[?], and the number of excesses over time follows
a homogeneous Poisson process [?]. If the last statement
were verified for large armed conflicts, it would mean that
no particular trend can be observed, i.e. that the propensity
of humanity to generate big wars has neither decreased nor
increased over time.

In order to avoid problems with the armed conflicts of
antiquity and possible missing data, we here restrict our
attention to all events that took place after 1500 AD, i.e.
in the last 515 years. As we have shown in Section VI,
missing data are unlikely to influence our estimates of the
shape parameter $\xi$, but they surely have an impact on the
number of observations in a given period. We do not want
to state that we live in a more violent era, simply because we
miss observations from the past.

If large events, those above the 25k threshold for raw data
(or the 145k one for rescaled amounts), follow a homogeneous
Poisson process in the period 1500-2015 AD, their inter-arrival
times need to be exponentially distributed. Moreover, no time
dependence should be identified among inter-arrival times, for
example when plotting an autocorrelogram (ACF).

Figure 14 shows that both of these characteristics are
satisfactorily observable in our data set. This is clearly visible in
the QQ-plot of Subfigure 14a, where most inter-arrival times
tend to cluster on the diagonal.

Another way to test if large armed conflicts follow a
homogeneous Poisson process is by considering the number
of events in non-overlapping intervals. For a Poisson process,
given a certain number of events in a time interval, the
numbers of events in non-overlapping subintervals follow a
Multinomial distribution. If the Poisson process is
homogeneous, then the Multinomial distribution is characterized by
the same probability of falling in any of the sub-intervals, that
is an equiprobability. It is not difficult to verify this with our
data, and to see that we cannot reject the null hypothesis of
equiprobable Multinomial distribution, over the period 1500-2015,
in which 504 events took place, choosing a significance
level of 10%.
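One simple way to carry out such a check is a Pearson chi-square test against equal expected counts, sketched below. The choice of this particular test, the number of sub-intervals and the function names are our assumptions for illustration; the text does not specify which test was used.

```python
def counts_per_interval(years, start, end, n_bins):
    """Count events in n_bins equal-width, non-overlapping intervals of [start, end)."""
    width = (end - start) / n_bins
    counts = [0] * n_bins
    for y in years:
        if start <= y < end:
            index = min(int((y - start) / width), n_bins - 1)
            counts[index] += 1
    return counts

def chi_square_equiprobable(counts):
    """Pearson chi-square statistic against equal expected counts in every bin."""
    expected = sum(counts) / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)
```

The statistic is then compared with the chi-square critical value with `n_bins - 1` degrees of freedom; with five sub-intervals and a 10% level, for example, the critical value is about 7.78, and equiprobability is not rejected when the statistic falls below it.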

Once again, the homogeneous Poisson behavior is verified
for raw, rescaled and dual data. Regarding the estimates of $\xi$,
if we restrict our attention to the period 1500-2015, we find
1.4093 for raw data, and 1.4653 for rescaled amounts.

To conclude this section, no particular trend in the number 
of armed conflicts can be traced. We are not saying that this 
is not possible over smaller time windows, but at least in the 
last 500 years humanity has shown to be as violent as usual. 

To conclude our paper, one may perhaps produce a
convincing theory about better, more peaceful days ahead, but
this cannot be stated on the basis of statistical analysis: this
is not what the data allow us to say. Not very good news, we

"when considering inter-arrival times, we deal with integers, as we only 
record the year of each event. 


have to admit. 



(a) Exponential QQ-plot of gaps (notice that gaps are expressed in
years, so that they are discrete quantities). Axis label: Ordered Data.

(b) ACF of gaps. Notice that the first lag has order 0. Axis label: Lag.

Fig. 14: Two plots to verify whether armed conflicts (raw
data) over the 25k casualties threshold follow a homogeneous
Poisson process.